!gdown 1U89A4c4oGSx5w5-XC-Ump--O7CphcL7M
Downloading... From: https://drive.google.com/uc?id=1U89A4c4oGSx5w5-XC-Ump--O7CphcL7M To: /content/eclipse_data_enriched_5000_years.csv 100% 5.21M/5.21M [00:00<00:00, 83.1MB/s]
import pandas as pd
data = pd.read_csv("https://eclipse.gsfc.nasa.gov/eclipse_besselian_from_mysqldump2.csv")
data.head()
| | year | month | day | td_ge | dt | luna_num | saros | eclipse_type | gamma | magnitude | ... | tan_f2 | tmin | tmax | etype | PNS | UNS | NCN | nSer | nSeq | nJLE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -1999 | 6 | 12 | 03:14:51 | 46438.2 | -49456 | 5 | T | -0.27009 | 1.07329 | ... | 0.004578 | -3.0 | 3.0 | 1 | 0 | 0 | 0 | 73 | 41 | 4 |
| 1 | -1999 | 12 | 5 | 23:45:23 | 46426.5 | -49450 | 10 | A | -0.23172 | 0.93818 | ... | 0.004732 | -3.0 | 3.0 | 2 | 0 | 0 | 0 | 73 | 27 | 40 |
| 2 | -1998 | 6 | 1 | 18:09:16 | 46414.6 | -49444 | 15 | T | 0.49936 | 1.02844 | ... | 0.004573 | -3.0 | 3.0 | 1 | 1 | 0 | 0 | 75 | 32 | 20 |
| 3 | -1998 | 11 | 25 | 05:57:03 | 46402.8 | -49438 | 20 | A | -0.90454 | 0.98056 | ... | 0.004737 | -3.0 | 3.0 | 2 | -1 | 0 | 0 | 72 | 17 | 20 |
| 4 | -1997 | 4 | 22 | 13:19:56 | 46392.9 | -49433 | -13 | P | -1.46705 | 0.16108 | ... | 0.004576 | -3.0 | 3.0 | 4 | -1 | -1 | -1 | 73 | 72 | 1 |
5 rows × 54 columns
data_kaggle = pd.read_csv('/content/eclipse_data_enriched_5000_years.csv')
data_kaggle.head()
| | Catalog Number | Calendar Date | Eclipse Time | Delta T (s) | Lunation Number | Saros Number | Eclipse Type | Gamma | Eclipse Magnitude | Latitude | ... | EII | Year Modulus | HEAS | Decade | Localized ESC | ESC Moving Average | ESC Wide-Scale Moving Average | Eclipse Interval | Cluster | Cluster 6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | -1999 June 12 | 03:14:51 | 46438 | -49456 | 5 | T | -0.2701 | 0.992601 | 6.0N | ... | 0.662068 | 1999 | 0.333667 | -2000 | 1.556657 | NaN | NaN | 0.333333 | 0 | 1 |
| 1 | 2 | -1999 December 5 | 23:45:23 | 46426 | -49450 | 10 | A | -0.2317 | 0.867659 | 32.9S | ... | 0.608567 | 1999 | 0.333667 | -2000 | 1.556657 | NaN | NaN | 0.500000 | 0 | 1 |
| 2 | 3 | -1998 June 1 | 18:09:16 | 46415 | -49444 | 15 | T | 0.4994 | 0.951077 | 46.2N | ... | 0.498677 | 1998 | 0.334000 | -2000 | 1.792195 | NaN | NaN | 0.500000 | 0 | 1 |
| 3 | 4 | -1998 November 25 | 05:57:03 | 46403 | -49438 | 20 | A | -0.9045 | 0.906871 | 67.8S | ... | 0.389974 | 1998 | 0.334000 | -2000 | 1.792195 | NaN | NaN | 0.500000 | 0 | 1 |
| 4 | 5 | -1997 April 22 | 13:19:56 | 46393 | -49433 | -13 | P | -1.4670 | 0.148987 | 60.6S | ... | NaN | 1997 | 0.334333 | -2000 | 2.004286 | NaN | NaN | 0.472222 | 1 | 0 |
5 rows × 47 columns
data.columns
Index(['year', 'month', 'day', 'td_ge', 'dt', 'luna_num', 'saros',
'eclipse_type', 'gamma', 'magnitude', 'lat_ge', 'lng_ge', 'lat_dd_ge',
'lng_dd_ge', 'sun_alt', 'sun_azm', 'path_width', 'central_duration',
'duration_secs', 'cat_no', 'canon_plate', 'julian_date', 't0', 'x0',
'x1', 'x2', 'x3', 'y0', 'y1', 'y2', 'y3', 'd0', 'd1', 'd2', 'mu0',
'mu1', 'mu2', 'l10', 'l11', 'l12', 'l20', 'l21', 'l22', 'tan_f1',
'tan_f2', 'tmin', 'tmax', 'etype', 'PNS', 'UNS', 'NCN', 'nSer', 'nSeq',
'nJLE'],
dtype='object')
data_kaggle.columns
Index(['Catalog Number', 'Calendar Date', 'Eclipse Time', 'Delta T (s)',
'Lunation Number', 'Saros Number', 'Eclipse Type', 'Gamma',
'Eclipse Magnitude', 'Latitude', 'Longitude', 'Sun Altitude',
'Sun Azimuth', 'Path Width (km)', 'Central Duration', 'Date Time',
'Year', 'Month', 'Day', 'Visibility', 'Eclipse Latitude',
'Eclipse Longitude', 'obliquity', 'Geographical Hemisphere',
'Daytime/Nighttime', 'Sun Constellation', 'Inter-Eclipse Duration',
'Visibility Score', 'Eclipse Classification', 'Duration in Seconds',
'Moon Distance (km)', 'Sun Distance (km)',
'Moon Angular Diameter (degrees)', 'Sun Angular Diameter (degrees)',
'Central Duration Seconds', 'Normalized Duration',
'Normalized Path Width', 'EII', 'Year Modulus', 'HEAS', 'Decade',
'Localized ESC', 'ESC Moving Average', 'ESC Wide-Scale Moving Average',
'Eclipse Interval', 'Cluster', 'Cluster 6'],
dtype='object')
data.dtypes
year                  int64
month                 int64
day                   int64
td_ge                object
dt                  float64
luna_num              int64
saros                 int64
eclipse_type         object
gamma               float64
magnitude           float64
lat_ge               object
lng_ge               object
lat_dd_ge           float64
lng_dd_ge           float64
sun_alt             float64
sun_azm             float64
path_width          float64
central_duration     object
duration_secs       float64
cat_no              float64
canon_plate         float64
julian_date         float64
t0                  float64
x0                  float64
x1                  float64
x2                  float64
x3                  float64
y0                  float64
y1                  float64
y2                  float64
y3                  float64
d0                  float64
d1                  float64
d2                  float64
mu0                 float64
mu1                 float64
mu2                 float64
l10                 float64
l11                 float64
l12                 float64
l20                 float64
l21                 float64
l22                 float64
tan_f1              float64
tan_f2              float64
tmin                float64
tmax                float64
etype                 int64
PNS                   int64
UNS                   int64
NCN                   int64
nSer                  int64
nSeq                  int64
nJLE                  int64
dtype: object
data_kaggle.dtypes
Catalog Number                       int64
Calendar Date                       object
Eclipse Time                        object
Delta T (s)                          int64
Lunation Number                      int64
Saros Number                         int64
Eclipse Type                        object
Gamma                              float64
Eclipse Magnitude                  float64
Latitude                            object
Longitude                           object
Sun Altitude                         int64
Sun Azimuth                          int64
Path Width (km)                     object
Central Duration                    object
Date Time                           object
Year                                 int64
Month                                int64
Day                                  int64
Visibility                          object
Eclipse Latitude                   float64
Eclipse Longitude                  float64
obliquity                          float64
Geographical Hemisphere             object
Daytime/Nighttime                   object
Sun Constellation                   object
Inter-Eclipse Duration               int64
Visibility Score                   float64
Eclipse Classification              object
Duration in Seconds                float64
Moon Distance (km)                 float64
Sun Distance (km)                  float64
Moon Angular Diameter (degrees)    float64
Sun Angular Diameter (degrees)     float64
Central Duration Seconds           float64
Normalized Duration                float64
Normalized Path Width              float64
EII                                float64
Year Modulus                         int64
HEAS                               float64
Decade                               int64
Localized ESC                      float64
ESC Moving Average                 float64
ESC Wide-Scale Moving Average      float64
Eclipse Interval                   float64
Cluster                              int64
Cluster 6                            int64
dtype: object
data.shape
(11898, 54)
data_kaggle.shape
(11898, 47)
data.isnull().sum()
year                0
month               0
day                 0
td_ge               0
dt                  0
luna_num            0
saros               0
eclipse_type        0
gamma               0
magnitude           0
lat_ge              0
lng_ge              0
lat_dd_ge           0
lng_dd_ge           0
sun_alt             0
sun_azm             0
path_width          0
central_duration    0
duration_secs       0
cat_no              0
canon_plate         0
julian_date         0
t0                  0
x0                  0
x1                  0
x2                  0
x3                  0
y0                  0
y1                  0
y2                  0
y3                  0
d0                  0
d1                  0
d2                  0
mu0                 0
mu1                 0
mu2                 0
l10                 0
l11                 0
l12                 0
l20                 0
l21                 0
l22                 0
tan_f1              0
tan_f2              0
tmin                0
tmax                0
etype               0
PNS                 0
UNS                 0
NCN                 0
nSer                0
nSeq                0
nJLE                0
dtype: int64
data_kaggle.isnull().sum()
Catalog Number                         0
Calendar Date                          0
Eclipse Time                           0
Delta T (s)                            0
Lunation Number                        0
Saros Number                           0
Eclipse Type                           0
Gamma                                  0
Eclipse Magnitude                      0
Latitude                               0
Longitude                              0
Sun Altitude                           0
Sun Azimuth                            0
Path Width (km)                     4200
Central Duration                    4200
Date Time                              0
Year                                   0
Month                                  0
Day                                    0
Visibility                             0
Eclipse Latitude                       0
Eclipse Longitude                      0
obliquity                              0
Geographical Hemisphere                0
Daytime/Nighttime                      0
Sun Constellation                      0
Inter-Eclipse Duration                 0
Visibility Score                       0
Eclipse Classification                 0
Duration in Seconds                11898
Moon Distance (km)                     0
Sun Distance (km)                      0
Moon Angular Diameter (degrees)        0
Sun Angular Diameter (degrees)         0
Central Duration Seconds            4294
Normalized Duration                 4294
Normalized Path Width               4381
EII                                 4381
Year Modulus                           0
HEAS                                   0
Decade                                 0
Localized ESC                          0
ESC Moving Average                     9
ESC Wide-Scale Moving Average        804
Eclipse Interval                       0
Cluster                                0
Cluster 6                              0
dtype: int64
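The cells that follow copy columns from `data` into `data_kaggle` by row position, which is only safe if both tables list the same eclipses in the same order. A minimal sketch of such a check on toy frames is shown here. The helper name `rows_aligned` is hypothetical, and using `year`/`Year` and `saros`/`Saros Number` as identifying columns is an assumption, not something the notebook verifies.

```python
import pandas as pd

def rows_aligned(left, right, key_pairs):
    """Return True if both frames have equal length and matching key columns row by row."""
    if len(left) != len(right):
        return False
    return all(
        (left[l_col].to_numpy() == right[r_col].to_numpy()).all()
        for l_col, r_col in key_pairs
    )

# Toy stand-ins for `data` and `data_kaggle` (values are illustrative only).
nasa = pd.DataFrame({"year": [-1999, -1999, -1998], "saros": [5, 10, 15]})
kaggle = pd.DataFrame({"Year": [-1999, -1999, -1998], "Saros Number": [5, 10, 15]})
# Same rows in reverse order, to show that the check notices a reordering.
shuffled = kaggle.iloc[::-1].reset_index(drop=True)

pairs = [("year", "Year"), ("saros", "Saros Number")]
ok_same_order = rows_aligned(nasa, kaggle, pairs)
ok_shuffled = rows_aligned(nasa, shuffled, pairs)
print(ok_same_order, ok_shuffled)
```

On the real tables the call would look like `rows_aligned(data, data_kaggle, pairs)`; if it returned False, a key-based merge would be a safer way to bring the columns across than plain assignment.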
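def txt_to_secons(text_value):
    # Expects a string whose first two characters are the minutes and whose
    # characters at positions 3-4 are the seconds (position 2 is a separator).
    minutes = int(text_value[:2])
    seconds = int(text_value[3:5])
    return (60 * minutes) + seconds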
def month_num_to_text(num_value):
    month = {
        1: "January",
        2: "February",
        3: "March",
        4: "April",
        5: "May",
        6: "June",
        7: "July",
        8: "August",
        9: "September",
        10: "October",
        11: "November",
        12: "December"
    }
    return month[num_value]
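The slicing-based duration parser above raises an error on any value that is not laid out exactly as two digits, a separator, and two digits. A more defensive variant could look like the sketch below. The `MMmSSs` layout (for example `06m37s`) is an assumption inferred from the slice positions, and the name `duration_to_seconds` and its `default` parameter are hypothetical.

```python
import re

# Assumed layout: minutes, the letter 'm', seconds, the letter 's' (e.g. "06m37s").
_DURATION = re.compile(r"^\s*(\d{1,2})m(\d{1,2})s\s*$")

def duration_to_seconds(text_value, default=0):
    """Parse an 'MMmSSs' string into seconds; return `default` for anything else."""
    match = _DURATION.match(str(text_value))
    if match is None:
        return default
    minutes, seconds = map(int, match.groups())
    return 60 * minutes + seconds

print(duration_to_seconds("06m37s"))  # 6 * 60 + 37
print(duration_to_seconds("-"))       # not a duration, falls back to the default
```

Returning a default instead of raising keeps `Series.apply` from aborting on an unexpected cell, at the cost of hiding malformed values; passing `default=None` would make them visible as missing data instead.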
data_kaggle['Path Width (km)'] = data['path_width']
data_kaggle['Central Duration'] = data['central_duration']
data_kaggle['Central Duration Seconds'] = data['central_duration']
data_kaggle['Central Duration Seconds'] = data_kaggle['Central Duration Seconds'].apply(txt_to_secons)
data_kaggle['Month'] = data_kaggle['Month'].apply(month_num_to_text)
data_kaggle['Month'] = pd.Categorical(data_kaggle['Month'],categories=['January','February','March','April','May','June','July','August','September','October','November','December'],ordered=True)
data_kaggle['Duration in Seconds'] = data['duration_secs']
data_final = data_kaggle.drop(columns=['Catalog Number','Calendar Date','Eclipse Time','Date Time','Latitude','Longitude','Central Duration','Duration in Seconds','Cluster', 'Cluster 6'])
data_final.head()
| | Delta T (s) | Lunation Number | Saros Number | Eclipse Type | Gamma | Eclipse Magnitude | Sun Altitude | Sun Azimuth | Path Width (km) | Year | ... | Normalized Duration | Normalized Path Width | EII | Year Modulus | HEAS | Decade | Localized ESC | ESC Moving Average | ESC Wide-Scale Moving Average | Eclipse Interval |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 46438 | -49456 | 5 | T | -0.2701 | 0.992601 | 74 | 344 | 246.6 | -1999 | ... | 0.534320 | 0.174066 | 0.662068 | 1999 | 0.333667 | -2000 | 1.556657 | NaN | NaN | 0.333333 |
| 1 | 46426 | -49450 | 10 | A | -0.2317 | 0.867659 | 76 | 21 | 235.9 | -1999 | ... | 0.543742 | 0.166314 | 0.608567 | 1999 | 0.333667 | -2000 | 1.556657 | NaN | NaN | 0.500000 |
| 2 | 46415 | -49444 | 15 | T | 0.4994 | 0.951077 | 60 | 151 | 110.8 | -1998 | ... | 0.181696 | 0.078224 | 0.498677 | 1998 | 0.334000 | -2000 | 1.792195 | NaN | NaN | 0.500000 |
| 3 | 46403 | -49438 | 20 | A | -0.9045 | 0.906871 | 25 | 74 | 162.4 | -1998 | ... | 0.099596 | 0.114165 | 0.389974 | 1998 | 0.334000 | -2000 | 1.792195 | NaN | NaN | 0.500000 |
| 4 | 46393 | -49433 | -13 | P | -1.4670 | 0.148987 | 0 | 281 | 0.0 | -1997 | ... | NaN | NaN | NaN | 1997 | 0.334333 | -2000 | 2.004286 | NaN | NaN | 0.472222 |
5 rows × 37 columns
data_final.isnull().sum()
Delta T (s)                           0
Lunation Number                       0
Saros Number                          0
Eclipse Type                          0
Gamma                                 0
Eclipse Magnitude                     0
Sun Altitude                          0
Sun Azimuth                           0
Path Width (km)                       0
Year                                  0
Month                                 0
Day                                   0
Visibility                            0
Eclipse Latitude                      0
Eclipse Longitude                     0
obliquity                             0
Geographical Hemisphere               0
Daytime/Nighttime                     0
Sun Constellation                     0
Inter-Eclipse Duration                0
Visibility Score                      0
Eclipse Classification                0
Moon Distance (km)                    0
Sun Distance (km)                     0
Moon Angular Diameter (degrees)       0
Sun Angular Diameter (degrees)        0
Central Duration Seconds              0
Normalized Duration                4294
Normalized Path Width              4381
EII                                4381
Year Modulus                          0
HEAS                                  0
Decade                                0
Localized ESC                         0
ESC Moving Average                    9
ESC Wide-Scale Moving Average       804
Eclipse Interval                      0
dtype: int64
numeric_columns_with_nans = ['Normalized Duration', 'Normalized Path Width', 'EII', 'ESC Wide-Scale Moving Average', 'ESC Moving Average']
# Fill missing values with the median, which is less affected by outliers than the mean
for column in numeric_columns_with_nans:
    median_value = data_final[column].median()
    data_final[column] = data_final[column].fillna(median_value)
data_final[numeric_columns_with_nans].isnull().sum()
Normalized Duration              0
Normalized Path Width            0
EII                              0
ESC Wide-Scale Moving Average    0
ESC Moving Average               0
dtype: int64
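To illustrate why the median was chosen over the mean for the fill value, the toy column below has one extreme entry and two gaps. This is only a sketch with made-up numbers; the `was_missing` flag is an optional extra that the notebook does not use.

```python
import pandas as pd

# Toy column with one extreme value and two gaps (illustrative values only).
scores = pd.Series([0.10, 0.12, 0.11, 9.00, None, None])

# Remember which rows were imputed, in case later analysis should exclude them.
was_missing = scores.isna()

mean_filled = scores.fillna(scores.mean())      # pulled upward by the 9.00 outlier
median_filled = scores.fillna(scores.median())  # stays close to the typical values

print(round(scores.mean(), 4), round(scores.median(), 4))
print(median_filled[was_missing].tolist())
```

Note that imputed rows become indistinguishable from measured ones once the gaps are filled, so keeping a flag like `was_missing` can matter when a third or more of a column is missing, as is the case for `EII` here.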
data_final.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11898 entries, 0 to 11897
Data columns (total 37 columns):
 #   Column                           Non-Null Count  Dtype
---  ------                           --------------  -----
 0   Delta T (s)                      11898 non-null  int64
 1   Lunation Number                  11898 non-null  int64
 2   Saros Number                     11898 non-null  int64
 3   Eclipse Type                     11898 non-null  object
 4   Gamma                            11898 non-null  float64
 5   Eclipse Magnitude                11898 non-null  float64
 6   Sun Altitude                     11898 non-null  int64
 7   Sun Azimuth                      11898 non-null  int64
 8   Path Width (km)                  11898 non-null  float64
 9   Year                             11898 non-null  int64
 10  Month                            11898 non-null  category
 11  Day                              11898 non-null  int64
 12  Visibility                       11898 non-null  object
 13  Eclipse Latitude                 11898 non-null  float64
 14  Eclipse Longitude                11898 non-null  float64
 15  obliquity                        11898 non-null  float64
 16  Geographical Hemisphere          11898 non-null  object
 17  Daytime/Nighttime                11898 non-null  object
 18  Sun Constellation                11898 non-null  object
 19  Inter-Eclipse Duration           11898 non-null  int64
 20  Visibility Score                 11898 non-null  float64
 21  Eclipse Classification           11898 non-null  object
 22  Moon Distance (km)               11898 non-null  float64
 23  Sun Distance (km)                11898 non-null  float64
 24  Moon Angular Diameter (degrees)  11898 non-null  float64
 25  Sun Angular Diameter (degrees)   11898 non-null  float64
 26  Central Duration Seconds         11898 non-null  int64
 27  Normalized Duration              11898 non-null  float64
 28  Normalized Path Width            11898 non-null  float64
 29  EII                              11898 non-null  float64
 30  Year Modulus                     11898 non-null  int64
 31  HEAS                             11898 non-null  float64
 32  Decade                           11898 non-null  int64
 33  Localized ESC                    11898 non-null  float64
 34  ESC Moving Average               11898 non-null  float64
 35  ESC Wide-Scale Moving Average    11898 non-null  float64
 36  Eclipse Interval                 11898 non-null  float64
dtypes: category(1), float64(19), int64(11), object(6)
memory usage: 3.3+ MB
data_final.head()
| | Delta T (s) | Lunation Number | Saros Number | Eclipse Type | Gamma | Eclipse Magnitude | Sun Altitude | Sun Azimuth | Path Width (km) | Year | ... | Normalized Duration | Normalized Path Width | EII | Year Modulus | HEAS | Decade | Localized ESC | ESC Moving Average | ESC Wide-Scale Moving Average | Eclipse Interval |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 46438 | -49456 | 5 | T | -0.2701 | 0.992601 | 74 | 344 | 246.6 | -1999 | ... | 0.534320 | 0.174066 | 0.662068 | 1999 | 0.333667 | -2000 | 1.556657 | 1.955336 | 1.950866 | 0.333333 |
| 1 | 46426 | -49450 | 10 | A | -0.2317 | 0.867659 | 76 | 21 | 235.9 | -1999 | ... | 0.543742 | 0.166314 | 0.608567 | 1999 | 0.333667 | -2000 | 1.556657 | 1.955336 | 1.950866 | 0.500000 |
| 2 | 46415 | -49444 | 15 | T | 0.4994 | 0.951077 | 60 | 151 | 110.8 | -1998 | ... | 0.181696 | 0.078224 | 0.498677 | 1998 | 0.334000 | -2000 | 1.792195 | 1.955336 | 1.950866 | 0.500000 |
| 3 | 46403 | -49438 | 20 | A | -0.9045 | 0.906871 | 25 | 74 | 162.4 | -1998 | ... | 0.099596 | 0.114165 | 0.389974 | 1998 | 0.334000 | -2000 | 1.792195 | 1.955336 | 1.950866 | 0.500000 |
| 4 | 46393 | -49433 | -13 | P | -1.4670 | 0.148987 | 0 | 281 | 0.0 | -1997 | ... | 0.309556 | 0.131078 | 0.532898 | 1997 | 0.334333 | -2000 | 2.004286 | 1.955336 | 1.950866 | 0.472222 |
5 rows × 37 columns
data_final.columns
Index(['Delta T (s)', 'Lunation Number', 'Saros Number', 'Eclipse Type',
'Gamma', 'Eclipse Magnitude', 'Sun Altitude', 'Sun Azimuth',
'Path Width (km)', 'Year', 'Month', 'Day', 'Visibility',
'Eclipse Latitude', 'Eclipse Longitude', 'obliquity',
'Geographical Hemisphere', 'Daytime/Nighttime', 'Sun Constellation',
'Inter-Eclipse Duration', 'Visibility Score', 'Eclipse Classification',
'Moon Distance (km)', 'Sun Distance (km)',
'Moon Angular Diameter (degrees)', 'Sun Angular Diameter (degrees)',
'Central Duration Seconds', 'Normalized Duration',
'Normalized Path Width', 'EII', 'Year Modulus', 'HEAS', 'Decade',
'Localized ESC', 'ESC Moving Average', 'ESC Wide-Scale Moving Average',
'Eclipse Interval'],
dtype='object')
data_final.to_csv('data.csv', sep=',')
import matplotlib.pyplot as plt
import plotly.express as px
import seaborn as sns
# Correlation Analysis
eclipse_features = data_final[['Delta T (s)', 'Lunation Number', 'Saros Number',
'Gamma', 'Eclipse Magnitude', 'Sun Altitude',
'Sun Azimuth', 'Path Width (km)', 'Year',
'Day', 'Eclipse Latitude', 'Eclipse Longitude',
'obliquity', 'Inter-Eclipse Duration', 'Visibility Score',
'Moon Distance (km)', 'Sun Distance (km)',
'Moon Angular Diameter (degrees)', 'Sun Angular Diameter (degrees)',
'Central Duration Seconds', 'Normalized Duration',
'Normalized Path Width', 'EII', 'Year Modulus',
'HEAS', 'Decade', 'Localized ESC', 'ESC Moving Average',
'ESC Wide-Scale Moving Average', 'Eclipse Interval']]
correlation_matrix = eclipse_features.corr()
plt.figure(figsize=(25, 25))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f")
plt.title('Correlation Matrix of Eclipse Features')
plt.show()
# This heatmap provides insights into how different eclipse parameters are interrelated. For instance, it can show whether the magnitude of an eclipse is related to its geographic latitude.
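A 30-by-30 annotated heatmap is hard to scan by eye, so it can help to list the strongest pairs directly from the correlation matrix. The sketch below is a complementary idea rather than part of the original analysis; the helper name `top_correlations` and the toy frame are hypothetical.

```python
import numpy as np
import pandas as pd

def top_correlations(frame, n=3):
    """Return the n column pairs with the largest absolute Pearson correlation."""
    corr = frame.corr()
    # Keep only the strict upper triangle so each pair appears once
    # and the diagonal (every column with itself) is dropped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)

toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],   # exactly twice column a
    "c": [4.0, 1.0, 3.0, 2.0],
})
strongest = top_correlations(toy, n=1)
print(strongest)
```

With the notebook's data the call would be `top_correlations(eclipse_features, n=10)`. Pairs that are near-duplicates by construction (for example a value and its normalized version) will dominate the list and are worth reading with that in mind.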
# Geospatial Distribution of Eclipses
# Plotting the latitude and longitude to see the distribution of eclipse paths
plt.figure(figsize=(12, 6))
sns.scatterplot(data=data_final, x='Eclipse Longitude', y='Eclipse Latitude', hue='Eclipse Type', style='Eclipse Type')
plt.title('Geographic Distribution of Eclipses')
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.grid(True)
plt.show()
# This visualization helps in understanding where most eclipses are visible from and highlights any geographical patterns or anomalies.
# Temporal Distribution of Eclipses
# Frequency of eclipses by year
eclipse_counts = data_final['Year'].value_counts().sort_index()
plt.figure(figsize=(12, 6))
eclipse_counts.plot(kind='line')
plt.title('Eclipse Frequency Over Time')
plt.xlabel('Year')
plt.ylabel('Number of Eclipses')
plt.grid(True)
plt.show()
# This plot will reveal any long-term trends or cyclic patterns in eclipse occurrences, which are important for understanding periodic astronomical phenomena.
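Because any single year contains only a handful of solar eclipses, a per-year line over several millennia mostly shows jitter between a few small integers. One possible complement, sketched here on made-up years, is to count eclipses per 100-year bin instead. The bin width and the variable names are assumptions for illustration.

```python
import pandas as pd

# Toy years, including negative ones as in the catalog (illustrative values only).
years = pd.Series([-1999, -1999, -1998, -1901, -1900, -1850, 2001, 2099])

# Floor division maps each year to the first year of its 100-year bin,
# and it rounds toward negative infinity, so -1999 lands in the -2000 bin.
century = years // 100 * 100
per_century = century.value_counts().sort_index()
print(per_century)
```

Applied to the notebook's table this would be `(data_final['Year'] // 100 * 100).value_counts().sort_index()`, which can be plotted the same way as `eclipse_counts`.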
displot = sns.displot(data_final, x="Eclipse Type", shrink=.8)
plt.title('Distribution of Eclipse Types')
plt.xlabel('Eclipse Type')
plt.ylabel('Count')
# Improving overall aesthetics with seaborn's despine function to remove top and right borders
sns.despine()
plt.show()
data2_filtrado = data_final[(data_final['Decade'] >= 2000) & (data_final['Decade'] <= 3000)]
fig = px.scatter_geo(data2_filtrado,
                     lat='Eclipse Latitude',
                     lon='Eclipse Longitude',
                     # color='Eclipse Magnitude',  # Variable for the color map
                     color='EII',  # Variable for the color map
                     hover_name='Eclipse Magnitude',  # Extra information shown on hover
                     title='Eclipse Influence Map',
                     # projection='natural earth',  # Map projection type
                     color_continuous_scale=px.colors.sequential.Plasma)  # Color scale
# Adjust the layout to center the title
fig.update_layout(
    title={
        'text': 'Eclipse Influence Map',
        'y': 0.9,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)
fig.show()
fig = px.density_mapbox(data2_filtrado,
lat='Eclipse Latitude',
lon='Eclipse Longitude',
z = 'EII',
radius = 8,
zoom = 2,
mapbox_style = 'open-street-map')
fig.show()
plt.figure(figsize=(14, 7))
lineplot = sns.lineplot(
x='Decade',
y='Eclipse Magnitude',
data=data_final.sort_values('Decade'),
marker='o', # Adds markers to each data point
linestyle='-', # Solid line
color='royalblue', # Line color
linewidth=1.5 # Line width
)
plt.title('Trend of Eclipse Magnitude Over Decades')
plt.xlabel('Decade')
plt.ylabel('Eclipse Magnitude')
plt.xticks(fontsize=12, rotation=45)
plt.yticks(fontsize=12)
# Highlighting specific points or trends if needed (e.g., highest magnitude)
plt.scatter(
x=data_final.loc[data_final['Eclipse Magnitude'].idxmax(), 'Decade'],
y=data_final['Eclipse Magnitude'].max(),
color='red',
s=50, # Size of the scatter point
label='Highest Magnitude',
zorder=5 # Ensures the point is on top
)
plt.legend()
# Improving overall aesthetics with seaborn's despine function to remove top and right borders
sns.despine()
plt.show()
df_aggregated = data_final.groupby('Decade')['Eclipse Magnitude'].mean().reset_index()
df_aggregated.head()
| | Decade | Eclipse Magnitude |
|---|---|---|
| 0 | -2000 | 0.732985 |
| 1 | -1990 | 0.777043 |
| 2 | -1980 | 0.733739 |
| 3 | -1970 | 0.735793 |
| 4 | -1960 | 0.791861 |
# Creating the interactive line plot using Plotly Express
fig = px.line(
df_aggregated,
x='Decade',
y='Eclipse Magnitude',
title='Average Trend of Eclipse Magnitude Over Decades',
labels={'Decade': 'Decade', 'Eclipse Magnitude': 'Average Eclipse Magnitude'}
)
fig.update_layout(
title={'text': "Average Trend of Eclipse Magnitude Over Decades", 'y':0.95, 'x':0.5, 'xanchor': 'center', 'yanchor': 'top'},
hovermode='x unified'
)
# Optimizing marker visibility for large datasets
fig.update_traces(
line=dict(width=2, color='Blue'),
marker=dict(size=4, color='LightSkyBlue', line=dict(width=1, color='DarkSlateGrey')),
hovertemplate="Decade: %{x}<br>Avg. Eclipse Magnitude: %{y:.2f}<extra></extra>"
)
fig.show()
df_aggregated_2 = data_final.groupby('Decade')['EII'].mean().reset_index()
df_aggregated_2.head()
| | Decade | EII |
|---|---|---|
| 0 | -2000 | 0.529093 |
| 1 | -1990 | 0.539257 |
| 2 | -1980 | 0.509304 |
| 3 | -1970 | 0.541058 |
| 4 | -1960 | 0.524778 |
# Creating the interactive line plot using Plotly Express
fig = px.line(
df_aggregated_2,
x='Decade',
y='EII',
title='Average Trend of Influence Index Over Decades',
labels={'Decade': 'Decade', 'EII': 'Average Influence Index'}
)
fig.update_layout(
title={'text': "Average Trend of Influence Index Over Decades", 'y':0.95, 'x':0.5, 'xanchor': 'center', 'yanchor': 'top'},
hovermode='x unified'
)
# Optimizing marker visibility for large datasets
fig.update_traces(
line=dict(width=2, color='Green'),
marker=dict(size=4, color='LightSkyBlue', line=dict(width=1, color='DarkSlateGrey')),
hovertemplate="Decade: %{x}<br>Avg. Influence Index: %{y:.2f}<extra></extra>"
)
# Display the plot
fig.show()
plt.figure(figsize=(10, 6))
sns.scatterplot(data=data_final, x='Gamma', y='Eclipse Magnitude', hue='Eclipse Type')
plt.title('Eclipse Magnitude vs. Gamma by Type')
plt.xlabel('Gamma')
plt.ylabel('Eclipse Magnitude')
plt.grid(True)
plt.show()
# This scatter plot helps to visualize how the eclipse magnitude is related to the gamma value, differentiated by the type of eclipse.
How the Gamma value affects eclipse magnitude, for each eclipse type separately
tipos_filtrados_t = data_final[data_final['Eclipse Type'].str.contains('T', na=False)]
tipos_filtrados_a = data_final[data_final['Eclipse Type'].str.contains('A', na=False)]
tipos_filtrados_h = data_final[data_final['Eclipse Type'].str.contains('H', na=False)]
tipos_filtrados_p = data_final[data_final['Eclipse Type'].str.contains('P', na=False)]
# Create a figure with subplots
# Adjust (nrows, ncols) depending on how you want to arrange the plots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))  # Adjust the size as needed
# Scatter plot for the first type in the first subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Gamma', y='Eclipse Magnitude', hue='Eclipse Type')
axs[0, 0].set_title('Eclipse Magnitude vs. Gamma by Type T')
axs[0, 0].set_xlabel('Gamma')
axs[0, 0].set_ylabel('Eclipse Magnitude')
axs[0, 0].grid(True)
sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Gamma', y='Eclipse Magnitude', hue='Eclipse Type')
axs[0, 1].set_title('Eclipse Magnitude vs. Gamma by Type A')
axs[0, 1].set_xlabel('Gamma')
axs[0, 1].set_ylabel('Eclipse Magnitude')
axs[0, 1].grid(True)
sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Gamma', y='Eclipse Magnitude', hue='Eclipse Type')
axs[1, 0].set_title('Eclipse Magnitude vs. Gamma by Type P')
axs[1, 0].set_xlabel('Gamma')
axs[1, 0].set_ylabel('Eclipse Magnitude')
axs[1, 0].grid(True)
sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Gamma', y='Eclipse Magnitude', hue='Eclipse Type')
axs[1, 1].set_title('Eclipse Magnitude vs. Gamma by Type H')
axs[1, 1].set_xlabel('Gamma')
axs[1, 1].set_ylabel('Eclipse Magnitude')
axs[1, 1].grid(True)
# Adjust the layout to avoid overlapping labels and titles
plt.tight_layout()
plt.show()
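The four per-type frames are selected by checking whether the type code contains a given letter. With the codes that appear in this dataset (such as `T`, `Tm`, `A+`, `H2`, `Pe`), the family is always the first character, so the same split can be sketched with a derived column and a single `groupby`. The `Family` column name and the toy rows are hypothetical.

```python
import pandas as pd

# Toy rows using type codes of the same shape as the dataset's (illustrative values only).
toy = pd.DataFrame({
    "Eclipse Type": ["T", "Tm", "A+", "An", "H2", "P", "Pe", "Pb"],
    "Gamma": [0.1, -0.2, 0.3, 0.9, 0.0, 1.2, -1.4, 1.5],
})

# The first character of the code is taken as the family: T, A, H or P.
toy["Family"] = toy["Eclipse Type"].str[0]
frames = {family: group for family, group in toy.groupby("Family")}
family_sizes = {family: len(group) for family, group in frames.items()}
print(family_sizes)
```

A derived column also makes it possible to pass `hue='Family'` to a single plot, while `hue='Eclipse Type'` inside a per-family panel still distinguishes the subtypes.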
# Create a figure with subplots
# Adjust (nrows, ncols) depending on how you want to arrange the plots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))  # Adjust the size as needed
# Scatter plot for the first type in the first subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Decade', y='Saros Number', hue='Eclipse Type')
axs[0, 0].set_title('Saros Number vs. Decade by Type T')
axs[0, 0].set_xlabel('Decade')
axs[0, 0].set_ylabel('Saros Number')
axs[0, 0].grid(True)
sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Decade', y='Saros Number', hue='Eclipse Type')
axs[0, 1].set_title('Saros Number vs. Decade by Type A')
axs[0, 1].set_xlabel('Decade')
axs[0, 1].set_ylabel('Saros Number')
axs[0, 1].grid(True)
sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Decade', y='Saros Number', hue='Eclipse Type')
axs[1, 0].set_title('Saros Number vs. Decade by Type P')
axs[1, 0].set_xlabel('Decade')
axs[1, 0].set_ylabel('Saros Number')
axs[1, 0].grid(True)
sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Decade', y='Saros Number', hue='Eclipse Type')
axs[1, 1].set_title('Saros Number vs. Decade by Type H')
axs[1, 1].set_xlabel('Decade')
axs[1, 1].set_ylabel('Saros Number')
axs[1, 1].grid(True)
# Adjust the layout to avoid overlapping labels and titles
plt.tight_layout()
# Show the complete figure
plt.show()
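The 2x2 grids in this notebook repeat the same five lines per panel with only the frame and the type letter changing. A loop over the axes is one way to cut the repetition, sketched below with plain matplotlib `scatter` instead of `sns.scatterplot` to keep the example small. The helper name `scatter_grid` is hypothetical, and the `Agg` backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

def scatter_grid(frames, x, y):
    """Draw one scatter panel per (label, frame) pair on a 2x2 grid and return the figure."""
    fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))
    for ax, (label, frame) in zip(axs.flat, frames.items()):
        ax.scatter(frame[x], frame[y], s=25)
        ax.set_title(f"{y} vs. {x} by Type {label}")
        ax.set_xlabel(x)
        ax.set_ylabel(y)
        ax.grid(True)
    fig.tight_layout()
    return fig

# Toy frame reused for all four panels (illustrative values only).
toy = pd.DataFrame({"Decade": [-2000, -1990, 2000], "Saros Number": [5, 10, 140]})
fig = scatter_grid({"T": toy, "A": toy, "P": toy, "H": toy}, x="Decade", y="Saros Number")
titles = [ax.get_title() for ax in fig.axes]
print(titles)
```

With the notebook's frames the dictionary would be `{"T": tipos_filtrados_t, "A": tipos_filtrados_a, "P": tipos_filtrados_p, "H": tipos_filtrados_h}`. Taking the axis labels from the same `x` and `y` arguments that drive the plot also rules out labels that disagree with the plotted columns.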
# Create a figure with subplots
# Adjust (nrows, ncols) depending on how you want to arrange the plots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))  # Adjust the size as needed
# Scatter plot for the first type in the first subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Eclipse Longitude', y='Eclipse Latitude', hue='Eclipse Type')
axs[0, 0].set_title('Latitude vs. Longitude by Type T')
axs[0, 0].set_xlabel('Longitude')
axs[0, 0].set_ylabel('Latitude')
axs[0, 0].grid(True)
sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Eclipse Longitude', y='Eclipse Latitude', hue='Eclipse Type')
axs[0, 1].set_title('Latitude vs. Longitude by Type A')
axs[0, 1].set_xlabel('Longitude')
axs[0, 1].set_ylabel('Latitude')
axs[0, 1].grid(True)
sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Eclipse Longitude', y='Eclipse Latitude', hue='Eclipse Type')
axs[1, 0].set_title('Latitude vs. Longitude by Type P')
axs[1, 0].set_xlabel('Longitude')
axs[1, 0].set_ylabel('Latitude')
axs[1, 0].grid(True)
sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Eclipse Longitude', y='Eclipse Latitude', hue='Eclipse Type')
axs[1, 1].set_title('Latitude vs. Longitude by Type H')
axs[1, 1].set_xlabel('Longitude')
axs[1, 1].set_ylabel('Latitude')
axs[1, 1].grid(True)
# Adjust the layout to avoid overlapping labels and titles
plt.tight_layout()
# Show the complete figure
plt.show()
Show how HEAS (the eclipse's alignment score with historical significance) varies by Year for each eclipse type
# Create a figure with subplots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))
# Set the background color of the figure
fig.patch.set_facecolor('lightgray')  # Change 'lightgray' to the desired figure background color
# First subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Year', y='HEAS', hue='Eclipse Type', s=25)
axs[0, 0].set_title('Year vs. HEAS by Type T')
axs[0, 0].set_xlabel('Year')
axs[0, 0].set_ylabel('HEAS')
axs[0, 0].grid(True)
axs[0, 0].set_facecolor('#f0f0f0')  # Change the background of the plot area
# Second subplot
sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Year', y='HEAS', hue='Eclipse Type', s=25)
axs[0, 1].set_title('Year vs. HEAS by Type A')
axs[0, 1].set_xlabel('Year')
axs[0, 1].set_ylabel('HEAS')
axs[0, 1].grid(True)
axs[0, 1].set_facecolor('#f0f0f0')  # Change the background of the plot area
# Third subplot
sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Year', y='HEAS', hue='Eclipse Type', s=25)
axs[1, 0].set_title('Year vs. HEAS by Type P')
axs[1, 0].set_xlabel('Year')
axs[1, 0].set_ylabel('HEAS')
axs[1, 0].grid(True)
axs[1, 0].set_facecolor('#f0f0f0')  # Change the background of the plot area
# Fourth subplot
sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Year', y='HEAS', hue='Eclipse Type', s=25)
axs[1, 1].set_title('Year vs. HEAS by Type H')
axs[1, 1].set_xlabel('Year')
axs[1, 1].set_ylabel('HEAS')
axs[1, 1].grid(True)
axs[1, 1].set_facecolor('#f0f0f0')  # Change the background of the plot area
# Adjust the layout to avoid overlapping labels and titles
plt.tight_layout()
# Show the complete figure
plt.show()
sns.histplot(x = "Eclipse Type", hue = "Daytime/Nighttime", data = data_final, multiple = "dodge", shrink=0.8)
sns.despine()
plt.show()
# Group and count the occurrences of each eclipse type within each Daytime/Nighttime group
series = data_final.groupby(['Daytime/Nighttime','Eclipse Type'])['Eclipse Type'].count()
# Convert the grouped Series into a DataFrame with unstack to prepare it for idxmax
grouped_df = series.unstack(fill_value=0)
grouped_df
| Eclipse Type | A | A+ | A- | Am | An | As | H | H2 | H3 | Hm | P | Pb | Pe | T | T+ | T- | Tm | Tn | Ts |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Daytime/Nighttime | |||||||||||||||||||
| Daytime | 2786 | 20 | 21 | 55 | 23 | 12 | 364 | 20 | 16 | 12 | 1936 | 87 | 77 | 2252 | 2 | 7 | 57 | 7 | 6 |
| Nighttime | 969 | 14 | 13 | 17 | 13 | 13 | 138 | 4 | 10 | 5 | 1939 | 76 | 85 | 797 | 7 | 10 | 15 | 7 | 6 |
# Find the eclipse type with the most occurrences for each Daytime/Nighttime value
most_frequent_type = grouped_df.idxmax(axis=1)
most_frequent_type
Daytime/Nighttime
Daytime      A
Nighttime    P
dtype: object
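The group, count, and unstack sequence used for this table can also be written with `pd.crosstab`, which builds the same kind of count table in one call and fills absent combinations with zero. The sketch below uses made-up rows, so its counts are not the notebook's.

```python
import pandas as pd

# Toy rows (illustrative values only).
toy = pd.DataFrame({
    "Daytime/Nighttime": ["Daytime", "Daytime", "Daytime", "Nighttime", "Nighttime", "Nighttime"],
    "Eclipse Type": ["A", "A", "T", "P", "P", "A"],
})

# Rows are the Daytime/Nighttime values, columns are the eclipse types.
counts = pd.crosstab(toy["Daytime/Nighttime"], toy["Eclipse Type"])
# For each row, the column label holding the largest count.
most_frequent = counts.idxmax(axis=1)
print(most_frequent.to_dict())
```

`idxmax` returns only the first label when two types tie for the highest count, so a tie would need a separate check if it matters for the conclusion.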
Show how visible an eclipse is according to the Influence Index (EII), by eclipse type
# Visibility Score: how visible or extensive the eclipse is
# Create a figure with subplots
# Adjust (nrows, ncols) depending on how you want to arrange the plots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))  # Adjust the size as needed
# Scatter plot for the first type in the first subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Visibility Score', y='EII', hue='Eclipse Type', s=25)
axs[0, 0].set_title('Visibility Score vs EII by Type T')
axs[0, 0].set_xlabel('Visibility Score')
axs[0, 0].set_ylabel('EII')
axs[0, 0].grid(True)
sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Visibility Score', y='EII', hue='Eclipse Type', s=25)
axs[0, 1].set_title('Visibility Score vs EII by Type A')
axs[0, 1].set_xlabel('Visibility Score')
axs[0, 1].set_ylabel('EII')
axs[0, 1].grid(True)
sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Visibility Score', y='EII', hue='Eclipse Type', s=25)
axs[1, 0].set_title('Visibility Score vs EII by Type P')
axs[1, 0].set_xlabel('Visibility Score')
axs[1, 0].set_ylabel('EII')
axs[1, 0].grid(True)
sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Visibility Score', y='EII', hue='Eclipse Type', s=25)
axs[1, 1].set_title('Visibility Score vs EII by Type H')
axs[1, 1].set_xlabel('Visibility Score')
axs[1, 1].set_ylabel('EII')
axs[1, 1].grid(True)
# Adjust the layout to avoid overlapping labels and titles
plt.tight_layout()
# Show the complete figure
plt.show()
plt.figure(figsize=(10, 6))
sns.histplot(x = "Eclipse Type", hue = "Geographical Hemisphere", data = data_final, multiple = "dodge", shrink=0.8)
sns.despine()
plt.show()
# Group and count the occurrences of each eclipse type within each Geographical Hemisphere group
series = data_final.groupby(['Geographical Hemisphere','Eclipse Type'])['Eclipse Type'].count()
# Convert the grouped Series into a DataFrame with unstack to prepare it for idxmax
grouped_df = series.unstack(fill_value=0)
# Find the eclipse type with the most occurrences for each Geographical Hemisphere value
most_frequent_type = grouped_df.idxmax(axis=1)
most_frequent_type
Geographical Hemisphere
N E    P
N W    P
S E    A
S W    P
dtype: object
data_final['Saros Number'].value_counts()
Saros Number
34 86
52 86
51 85
32 84
53 84
..
188 7
189 5
-12 4
-13 2
190 1
Name: count, Length: 204, dtype: int64
lineplot = sns.lineplot(
y='Gamma',
x='Saros Number',
data=data_final.sort_values('Saros Number'),
marker='o', # Adds markers to each data point
linestyle='-', # Solid line
color='royalblue', # Line color
linewidth=1.5 # Line width
)
scatter = sns.scatterplot(
    x='Eclipse Latitude',
    y='Gamma',
    # y='Eclipse Latitude',
    data=data_final,
    hue='Eclipse Type'
)
# Move the legend to the side of the plot: anchoring its upper-left corner
# just past the right edge of the axes keeps it from covering the points
plt.legend(loc='upper left', bbox_to_anchor=(1.05, 1), borderaxespad=0)
plt.show()
tipos_filtrados_t = data_final[data_final['Eclipse Type'].str.contains('T', na=False)]
tipos_filtrados_a = data_final[data_final['Eclipse Type'].str.contains('A', na=False)]
tipos_filtrados_h = data_final[data_final['Eclipse Type'].str.contains('H', na=False)]
tipos_filtrados_p = data_final[data_final['Eclipse Type'].str.contains('P', na=False)]
# Create a figure with subplots
# Adjust (nrows, ncols) depending on how you want to arrange the plots
fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10)) # Adjust the size as needed
# Scatter plot for the first type in the first subplot
sns.scatterplot(ax=axs[0, 0], data=tipos_filtrados_t, x='Eclipse Latitude', y='Gamma', hue='Eclipse Type')
axs[0, 0].set_title('Gamma vs Eclipse Latitude by Type T')
axs[0, 0].set_xlabel('Eclipse Latitude')
axs[0, 0].set_ylabel('Gamma')
axs[0, 0].grid(True)
sns.scatterplot(ax=axs[0, 1], data=tipos_filtrados_a, x='Eclipse Latitude', y='Gamma', hue='Eclipse Type')
axs[0, 1].set_title('Gamma vs Eclipse Latitude by Type A')
axs[0, 1].set_xlabel('Eclipse Latitude')
axs[0, 1].set_ylabel('Gamma')
axs[0, 1].grid(True)
sns.scatterplot(ax=axs[1, 0], data=tipos_filtrados_p, x='Eclipse Latitude', y='Gamma', hue='Eclipse Type')
axs[1, 0].set_title('Gamma vs Eclipse Latitude by Type P')
axs[1, 0].set_xlabel('Eclipse Latitude')
axs[1, 0].set_ylabel('Gamma')
axs[1, 0].grid(True)
sns.scatterplot(ax=axs[1, 1], data=tipos_filtrados_h, x='Eclipse Latitude', y='Gamma', hue='Eclipse Type')
axs[1, 1].set_title('Gamma vs Eclipse Latitude by Type H')
axs[1, 1].set_xlabel('Eclipse Latitude')
axs[1, 1].set_ylabel('Gamma')
axs[1, 1].grid(True)
# Adjust the layout so labels and titles do not overlap
plt.tight_layout()
# Show the complete figure
plt.show()
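The four near-identical panels above could be produced by a loop, which keeps the axis labels and titles consistent. This is only a sketch: it uses an invented toy DataFrame in place of `data_final` and plain `ax.scatter` instead of `sns.scatterplot`, so colors and legends will differ from the seaborn version.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Toy stand-in for data_final; the values are invented for illustration only.
toy = pd.DataFrame({
    "Eclipse Type": ["T", "A", "P", "H", "T", "A"],
    "Eclipse Latitude": [10.0, -20.0, 65.0, 5.0, -15.0, 30.0],
    "Gamma": [0.2, -0.3, 1.1, 0.1, -0.25, 0.5],
})

fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(14, 10))

# One panel per eclipse-type code, filled in row-major order.
for ax, code in zip(axs.flat, ["T", "A", "P", "H"]):
    subset = toy[toy["Eclipse Type"].str.contains(code, na=False)]
    ax.scatter(subset["Eclipse Latitude"], subset["Gamma"], s=25)
    ax.set_title(f"Gamma vs Eclipse Latitude by Type {code}")
    ax.set_xlabel("Eclipse Latitude")
    ax.set_ylabel("Gamma")
    ax.grid(True)

fig.tight_layout()
plt.show()
```

Because the label strings are written once inside the loop, an x/y mix-up cannot affect only some of the panels.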
lineplot = sns.lineplot(
x='Sun Altitude',
y='Visibility Score',
data=data_final[data_final['Sun Altitude'] != 0],
marker='o', # Adds markers to each data point
linestyle='-', # Solid line
color='royalblue', # Line color
linewidth=1.5 # Line width
)
data_month_grouped = pd.DataFrame(data_final.groupby('Month')['Eclipse Type'].count().reset_index(name='Count'))
sns.set_style("whitegrid")
plt.figure(figsize=(6,6))
plt.pie(data_month_grouped['Count'], labels=data_month_grouped['Month'], autopct='%.2f%%')
plt.title('Distribution of eclipses by month')
plt.show()
# Creating the line plot with a more appealing aesthetic
lineplot = sns.lineplot(
x='Sun Altitude',
y='Eclipse Magnitude',
data=data_final[data_final['Sun Altitude'] != 0],
marker='o', # Adds markers to each data point
linestyle='-', # Solid line
color='royalblue', # Line color
linewidth=1.5 # Line width
)
df_aggregated = data_final[data_final['Sun Altitude'] != 0].groupby('Sun Altitude')['Eclipse Magnitude'].mean().reset_index()
# Creating the interactive line plot using Plotly Express
fig = px.line(
df_aggregated,
x='Sun Altitude',
y='Eclipse Magnitude',
title='Average Trend of Eclipse Magnitude Over Sun Altitude',
labels={'Sun Altitude': 'Sun Altitude', 'Eclipse Magnitude': 'Average Eclipse Magnitude'}
)
fig.update_layout(
title={'text': "Average Trend of Eclipse Magnitude Over Sun Altitude", 'y':0.95, 'x':0.5, 'xanchor': 'center', 'yanchor': 'top'},
hovermode='x unified'
)
# Draw markers as well as the line (px.line shows lines only by default); keep them small for large datasets
fig.update_traces(
mode='lines+markers',
line=dict(width=2, color='Orange'),
marker=dict(size=4, color='LightSkyBlue', line=dict(width=1, color='DarkSlateGrey')),
hovertemplate="Sun Altitude: %{x}<br>Avg. Eclipse Magnitude: %{y:.2f}<extra></extra>"
)
# Display the plot
fig.show()
lineplot = sns.lineplot(
x='Month',
y='Eclipse Magnitude',
data=data_final,
marker='o', # Adds markers to each data point
linestyle='-', # Solid line
color='orange', # Line color
linewidth=1.5 # Line width
)
plt.xticks(fontsize=12, rotation=90)
plt.yticks(fontsize=12)
plt.show()
!pip -q install nbconvert
!ls /content
data.csv eclipse_data_enriched_5000_years.csv sample_data
!jupyter nbconvert --to html /content/Eclipse_limpieza_datos.ipynb
[NbConvertApp] WARNING | pattern '/content/Eclipse_limpieza_datos.ipynb' matched no files
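The warning above means nbconvert found no file at that path, which agrees with the `!ls /content` listing: the notebook is not in `/content`. A minimal sketch of guarding the conversion is shown below; `convert_if_present` is a hypothetical helper introduced here for illustration, and it assumes the notebook has first been uploaded or copied to the given path.

```python
from pathlib import Path
import subprocess


def convert_if_present(notebook_path: str) -> bool:
    """Run nbconvert only when the notebook exists; return True if it ran."""
    path = Path(notebook_path)
    if not path.is_file():
        # Report the missing file instead of letting nbconvert print its help text.
        print(f"Notebook not found: {path}")
        return False
    subprocess.run(["jupyter", "nbconvert", "--to", "html", str(path)], check=True)
    return True


convert_if_present("/content/Eclipse_limpieza_datos.ipynb")
```

With the file missing, the helper prints a single line and returns `False` rather than producing the full nbconvert usage dump.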